Transition Point Dynamic Programming
Abstract
Transition point dynamic programming (TPDP) is a memory-based, reinforcement learning, direct dynamic programming approach to adaptive optimal control that can reduce the learning time and memory usage required for the control of continuous stochastic dynamic systems. TPDP does so by determining an ideal set of transition points (TPs), which specify only the control action changes necessary for optimal control. TPDP converges to an ideal TP set by using a variation of Q-learning to assess the merits of adding, swapping, and removing TPs from states throughout the state space. When applied to a race track problem, TPDP learned the optimal control policy much sooner than conventional Q-learning, and it did so using less memory.
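The idea of storing only action changes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`run_with_transition_points`, `q_update`), the dictionary-based TP table, and the default values of `alpha` and `gamma` are all assumptions made for illustration. The first function shows how a sparse TP set can drive control by holding the current action until a state with a stored TP is reached. The second is the standard one-step Q-learning update, included only as the conventional baseline; the TPDP variation for adding, swapping, and removing TPs is not specified in this abstract and is not reproduced here.

```python
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple

State = Hashable
Action = int


def run_with_transition_points(
    trajectory: Iterable[State],
    transition_points: Dict[State, Action],
    initial_action: Action,
) -> List[Action]:
    """Return the action applied at each visited state.

    Hypothetical helper: the current action is held until a state that
    carries a transition point is reached, then switched to that TP's action.
    """
    action = initial_action
    applied: List[Action] = []
    for state in trajectory:
        if state in transition_points:
            action = transition_points[state]  # action changes only at a TP
        applied.append(action)
    return applied


def q_update(
    q: Dict[Tuple[State, Action], float],
    state: State,
    action: Action,
    reward: float,
    next_state: State,
    actions: Sequence[Action],
    alpha: float = 0.1,   # learning rate (assumed value)
    gamma: float = 0.95,  # discount factor (assumed value)
) -> float:
    """Standard one-step Q-learning update (the baseline, not the TPDP variant)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    # Five visited states, but only one stored entry (a TP at state 2).
    print(run_with_transition_points([0, 1, 2, 3, 4], {2: 1}, initial_action=0))
```

In this sketch a dense policy table would need one entry per state, whereas the TP table needs an entry only where the action changes, which is the source of the memory saving the abstract describes.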
Publication date: 1993